Merge with `main` again by alexcrichton · Pull Request #112 · bytecodealliance/wasip3-prototyping

alexcrichton · 2025-04-07T21:11:09Z

Trying to reduce conflicts from version bumps...

… (#10502) * Cranelift: remove return-value instructions after calls at callsites. This PR addresses the issues described in #10488 in a more head-on way: it removes the use of separate "return-value instructions" that load return values from the stack, instead folding these loads into the semantics of the call VCode instruction. This is a prerequisite for exception-handling: we need calls to be workable as terminators, meaning that we cannot require any other (VCode) instructions after the call to define the return values. In principle, this PR starts simply enough: the return-locations list on the `CallInfo` that each backend uses to provide regalloc metadata is updated to support a notion of "register or stack address" as the source of each return value, and this list is now used for both kinds of returns, not just returns in registers. Shared code is defined in `machinst::abi` used by all backends to perform the requisite loads. In order to make this work with more defined values than fit in registers, however, this PR also had to add support for "any"-constrained registers to Cranelift, and handling allocations that may be spillslots. This has always been supported by RA2, but this is the first time that Cranelift uses them directly (previously they were used only internally in RA2 as lowerings from other kinds of constraints like safepoints). This requires encoding a spillslot index in our `Reg` type. There is a little bit of complexity around handling the loads/defs as well: if we have a return value on-stack, and we need to put it in a spillslot, we cannot do a memory-to-memory move directly, so we need a temporary register. Earlier versions of this PR allocated another temp as a vreg on the call, but this doesn't work with all calling conventions (too many clobbers). For simplicity I picked a particular register that is (i) clobbered by calls and (ii) not used for return values for each architecture (x86-64's tailcall needed to lose one return-in-register slot to make this work). This removes retval insts from the shared ABI infra completely. s390x is different, still, because it handles callsite lowering from ISLE; we will need to address that separately for exception support there. * Fix is_included_in_clobbers on aarch64: new defs must skip optimization. * Review feedback: add assert. * Review feedback: handle retval temp reg via ABI trait method. * Update is_clobbered_in_inst to affect only clobbers, not all defs.

Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>

…k` module to the top level of the crate (#10517) Split off from bytecodealliance/wasmtime#10503

Currently the `wast` test suite does a combinatorial matrix of `compiler x pooling x allocator` to produce the list of tests to run. We also have a `wast_tests` fuzzer, however, which configures even more knobs and run tests under these configurations. In the interest of saving time on CI and avoiding too much combinatorial explosion this commit changes the `wast` test suite to avoiding combinatorics and instead running in only "interesting" combinations. The goal here is to weed out common issues on CI quickly, such as compiler-specific behaviors, but leave the full spectrum of combinatorial tests to fuzzing. For example we probably aren't getting much extra coverage running Winch with the pooling allocator over running Cranelift with the pooling allocator. This reduces the number of tests from ~4500 to ~2500 which should hopefully lead to some minor improvements in CI time.

This commit implements a workaround for rust-lang/rust#139487 which is necessary to have these implementations continue to work on more recent compilers due to the underlying implementation changing.

These tests assume that a very large allocation will fail due to exceeding address space limits. This is not the case on s390x, where the full 64-bit address space is available for allocation. (Of course, actually using that much memory would still fail at some point, but the mere allocation may not, depending on the kernel's memory overcommit handling configuration.) Note that running the tests under qemu will cause the expected failure, which is why this problem is not noticed by the CI. Simply disable these test on s390x.

This fixes the Pulley ABI code to only have a tail-call-specific adjustment on the tail ABI, not all ABIs. This matches what other backends such as riscv64 do as well. This is not exposed through wasm I believe because no native signature can exercise this code (I think) and all wasm code uses the `tail` ABI anyway. This was discovered by running Cranelift filetests using ASAN which is something I hope to be able to add to our CI in the near future.

* Support adding additional capacity to `FreeList`s Split out from bytecodealliance/wasmtime#10503 * Add a test for `FreeList::add_capacity` * Also test when old capacity is not a multiple of our alignment * Add comments about alignment-rounding and free list capacity * Don't run little example as a doc test

…nt` (#10526) And let alias analysis dedupe and clean up multiple uses, instead of trying to do that manually in our Wasm-to-CLIF frontend by loading the pointer once in the entry block and trying to keep track of whether it is actually necessary to pre-declare the pointer and all that. Split off from bytecodealliance/wasmtime#10503

* Cranelift: update to regalloc2 0.11.3. This pulls in a fix for a fuzzbug found after #10502 started generating more challenging constraints for regalloc. The fix in bytecodealliance/regalloc2#214 updates bundle-splitting logic to properly handle bundles with multiple live-ranges all covering one instruction. * Update test expectations after regalloc perturbation.

…531) This is a follow-on to bytecodealliance/wasmtime#10502 implementing the same logic for s390x. Like other platforms, we now no longer emit any instructions handling return values after the call instruction; instead, everything is done within a pseudo Call instruction. Unlike other platforms, this also has to handle vector lane swapping when calling between ABIs that mix BE and LE lane orders. The pseudo Call instruction needs enough type information to make this choice, therefore I had to add a type field to the RetLocation::Reg case (the ::Stack case already had one). All other changes are contained within the s390x back-end. To simplify the implementation, the patch also implements a number of clean-ups: - Introduce a MemArg::SpillOffset variant - Remove the (unnecessary on this platform) island code size check - Merge the CallInd instructions into the base Call instructions, using a new CallInstDest type to carry the call destination

…ent` (#10541) * Stop returning a `WasmResult` from the bounds-checking code; it is actually infallible and we were just plumbing around results for no reason. * Split out a few small helper methods from `FuncEnvironment::make_heap`. This allows for reusing them in bytecodealliance/wasmtime#10503

* Apply argument extensions to component intrinsics too This commit fixes the definition of host signatures for component builtins to work in the same way that core builtins do which is to use platform-specific sign-extension-rules by default. This fixes a regression found in the wasip3-prototyping repository where new component intrinsics didn't have this set and subsequently were failing in riscv64 CI. * Add some tests

cfallin and others added 14 commits April 5, 2025 05:01

Bump Wasmtime to 33.0.0 (#10525)

e06fe2c

Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>

Move the `wasmtime_cranelift::translate::code_translator::bounds_chec…

801cd06

…k` module to the top level of the crate (#10517) Split off from bytecodealliance/wasmtime#10503

Fix float math implementations on Rust 1.87+ (#10534)

fe3000d

This commit implements a workaround for rust-lang/rust#139487 which is necessary to have these implementations continue to work on more recent compilers due to the underlying implementation changing.

Merge remote-tracking branch 'upstream/main'

0628441

alexcrichton enabled auto-merge April 7, 2025 21:11

alexcrichton added this pull request to the merge queue Apr 7, 2025

Merged via the queue into bytecodealliance:main with commit aa801ba Apr 7, 2025
79 checks passed

alexcrichton deleted the another-merge branch April 7, 2025 21:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge with `main` again#112

Merge with `main` again#112
alexcrichton merged 14 commits into
bytecodealliance:mainfrom
alexcrichton:another-merge

alexcrichton commented Apr 7, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

alexcrichton commented Apr 7, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants